Privacy-Preserving Text Indexing for Search of Documents
Author
Abstract
Protecting the content of sensitive text documents is important in enterprise intranets. An index structure is needed to support efficient search and retrieval, but it can leak information: through statistical attacks, an adversary can draw probabilistic inferences about the contents of the document collection. Zerr et al. present a confidential index structure and a ranking of the documents retrieved for a query, but only for single-term queries. The solution proposed in this paper generalizes Zerr’s method by using an anonymization parameter and query-dependent anonymized inverse document frequency factors; it thereby provides better ranking and makes multi-term queries possible.
Key-Words: information retrieval, confidential text indexing, inverted index, posting list, inverse document frequency, r-confidentiality, anonymization, ranking
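For readers unfamiliar with the terms in the abstract, the following is a minimal sketch of a plain (non-confidential) inverted index with posting lists and inverse-document-frequency ranking for a multi-term query. It is not the paper's method: it contains no anonymization parameter and no r-confidentiality guarantee, and the function names (`build_index`, `idf`, `rank`), the sample documents, and the logarithmic IDF formula are all assumptions chosen for illustration.

```python
import math
from collections import Counter, defaultdict


def build_index(docs):
    """Map each term to a posting list {doc_id: term_frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, tf in Counter(text.lower().split()).items():
            index[term][doc_id] = tf
    return index


def idf(index, term, n_docs):
    """Plain inverse document frequency (assumed form: log(N / df))."""
    df = len(index.get(term, {}))
    return math.log(n_docs / df) if df else 0.0


def rank(index, query, n_docs):
    """Score documents by summing tf * idf over all query terms."""
    scores = defaultdict(float)
    for term in query.lower().split():
        weight = idf(index, term, n_docs)
        for doc_id, tf in index.get(term, {}).items():
            scores[doc_id] += tf * weight
    # Highest score first; ties broken by document id for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical three-document collection.
docs = {
    "d1": "budget report budget",
    "d2": "merger report",
    "d3": "holiday schedule",
}
index = build_index(docs)
results = rank(index, "budget report", len(docs))
```

The posting lists here store exact term frequencies, and the IDF weights follow directly from exact document frequencies. That is the kind of statistical information the abstract identifies as a leakage risk, which a confidential index would need to hide or anonymize.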
Similar resources
A Comparing between the impacts of text based indexing and folksonomy on ranking of images search via Google search engine
Background and Aim: The purpose of this study was to compare the impact of text based indexing and folksonomy in image retrieval via Google search engine. Methods: This study used experimental method. The sample is 30 images extracted from the book “Gray anatomy”. The research was carried out in 4 stages; in the first stage, images were uploaded to an “Instagram” account so the images are tagge...
Privacy preserving document indexing infrastructure for a distributed environment
To carry out work assignments, small groups distributed within a larger enterprise or collaborative community often need to share documents among themselves while shielding those documents from others’ eyes. In this situation, users need an indexing facility that can quickly locate relevant documents that they are allowed to access, without (1) leaking information about the remaining documents,...
Spatio-textual Indexing for Geographical Search on the Web
Many web documents refer to specific geographic localities and many people include geographic context in queries to web search engines. Standard web search engines treat the geographical terms in the same way as other terms. This can result in failure to find relevant documents that refer to the place of interest using alternative related names, such as those of included or nearby places. This ...
An Improved K-Nearest Neighbor with Crow Search Algorithm for Feature Selection in Text Documents Classification
The Internet provides easy access to a kind of library resources. However, classification of documents from a large amount of data is still an issue and demands time and energy to find certain documents. Classification of similar documents in specific classes of data can reduce the time for searching the required data, particularly text documents. This is further facilitated by using Artificial...
Journal title:
Volume, issue:
Pages: -
Publication date: 2012